knitr::opts_chunk$set(echo =TRUE, message=FALSE, warning=FALSE)options(warn =-1)rm(list=ls())library(tinytex)library(ggplot2)library(table1)library(gt)# test these packages# just for testing any conflicts# library(tidyverse)# library(plyr)# library(dplyr)# library(glmnet)library(survival)library(data.table)library(randomForest)library(grf)library(policytree)library(DiagrammeR)library(grid)library(forestploter)library(randomizr)codepath <-c("Rcode/forestsearch/R/")source(paste0(codepath,"source_forestsearch_v0.R"))source_fs_functions(file_loc=codepath)source("Rcode/kmplotting/R/KMplotting_functions_v0.R")# Save output flagoutput <-FALSE# Loading resultsloadit <-TRUE# Record time of all analysestALL.start<-proc.time()[3]#library(doParallel)#cat("Number of cores=",c(detectCores()),"\n")# Note: leave workers unspecified# on my linux machine workers(=100) needs to be setplan(multisession)#plan(multisession, workers=120)nsims <-1000nboots <-1000
1 Summary
We evaluate the potential for utilizing subgroup analyses for enhancing regional consistency analyses in oncology multi-regional clinical trials (MRCTs). We consider a setting where asia pacific (AP) countries consist of approximately \(15-20\%\) of a global trial and the criteria for consistency is a positive point estimate (Cox model hazard-ratio \(< 1.0\)) while the global trial meets statistical significance.
Suppose the following –
ITT meets statistical significance but the AP regional analysis appears problematic (hazard-ratio estimate \(\approx 1.0\))
Can a subgroup with enhanced benefit be identified in the non-AP population which then translates to the AP population?
That is, suppose there exists a subgroup \({\cal G}\) such that the treatment affect is more pronounced and (standard) Cox model analyses based on the \({\cal G}\) subgroup of AP (AP\(({\cal G})\), say) enables meeting consistency.
We assume Cox model analyses are the primary analysis where only the treatment arm is included as a covariate (with or without stratification by randomization factors).
While there are a plethora of subgroup identification approaches available we consider the forestSearch approach, León et al. (2024), which was developed for identifying strong subgroup treatment effects (i.e., strong HTEs).
We consider three objectives: (A) Identify a subgroup with the largest detrimental effect (harm, say) defined in terms of the largest hazard-ratio estimate exceeding a clinical threshold in favor of control (e.g., \(\hat\beta \geq \log(0.90)\)); (B) Identify the smallest subgroup with a hazard-ratio estimate indicative of harm (e.g., \(\hat\beta \geq \log(0.90)\)); and (C) Identify the largest subgroup with strong benefit (e.g., \(\hat\beta \leq \log(0.60)\)). These objectives are related with the overarching goal of identifying a subgroup with enhanced benefit; In (A) and (B) we identify a subgroup indicative of harm, \(\hat{\cal H}\) say, in which case the complementary subgroup (\({\hat{\cal H}}^{c}:= {\hat{\cal G}}\)) may be generally beneficial. Whereas in (C) we are more directly targeting a strong (clinically compelling) beneficial subgroup effect.
In this note we provide a brief review of the Weibull model in Section 2 where we describe a (2-phase) spline model that we utilize for inducing differential treatment effects by way of a ‘’biomarker’‘. Section 3 provides examples of how to generate’‘biomarker profiles’’ of interest where one can define subgroups based on biomarker cutpoints which correspond to underlying causal hazard ratio effects (e.g., causal hazard-ratio effect for biomarker \(\geq 2\) is approximately \(0.5\), say). Section 4 describes the simulation setup (data generating model/process) and generates an example dataset used for analysis illustration where in Section 5 we discuss details of implementing (A)-(C) above. In Section 6 we evaluate the operating characteristics for objective (B) under conditions where there exists a strong biomarker effect, and under the null of a uniform treatment benefit (i.e., \({\cal G} = \emptyset\)). Lastly, Section 7 briefly discusses estimation (de-biased point and CI estimates) – taking into account the identification algorithm – where there is interest in inference regarding the population from which the subgroup was identified (e.g., non-AP).
Throughout this note R code is provided and illustrated based on real trial data (subject-level baseline characteristics are fixed at their observed quantities). Our hope is that these notes can serve as a framework for future trial planning as well as for simulating from study data to understand the operating characteristics for teams and external discussions.
Subgroup identification analysis based on non-AP data to identify subgroups of marked benefit in AP. Operating characteristics evaluates whole process starting from analysis option
2 Weibull AFT/Cox model with biomarker effects
2.1 Brief Review
For the Weibull distribution with shape parameter \(\nu\) and scale parameter \(\theta\) the density, cdf, survival, hazard and cumulative hazard functions are: \[\begin{eqnarray*}
f(t) &=& \nu \theta^{-\nu}t^{\nu-1}\exp(-(t/\theta)^{\nu}), \cr F(t)
&=& \int_{0}^{t}\nu \theta^{-\nu}s^{\nu-1}\exp(-(s/\theta)^{\nu})ds
\cr &=& \int_{0}^{(t/\theta)^{\nu}}e^{-w}dw \cr & &
(w=(s/\theta)^{\nu},dw=\nu s^{\nu-1}\theta^{-\nu}ds) \cr
&=&1-\exp(-(t/\theta)^{\nu}), \cr S(t) &=& \exp(-(t/\theta)^{\nu}),
\cr \lambda(t)&=&\nu \theta^{-\nu}t^{\nu-1}, \cr \Lambda(t) &=&
-\log(S(t))=(t/\theta)^{\nu}.
\end{eqnarray*}\] Note that here we define the density to correspond with R’s definition. For shape parameter \(\nu \in (0,1)\) the hazard is strictly decreasing in \(t \geq 0\), whereas for \(\nu >1\) the hazard is strictly increasing in \(t \geq 0\).
Note: \(\Lambda(T) \sim E(1)\)
The cumulative hazard function \(\Lambda(\cdot)\) evaluated at \(T\), \(\Lambda(T)\), as a random variable has cdf \[\eqalign{ \Pr(\Lambda(T) \leq t) &=\Pr(-\log(1-F(T)) \leq t) =\Pr(1-F(T) \geq e^{-t}) \cr
&=\Pr(T \leq F^{-1}(1-e^{-t})) =F(F^{-1}(1-e^{-t})) = 1-e^{-t}, \cr}\] which is the CDF of the exponential distribution, \(E(1)\) (say).
In the following we use
\[\begin{equation}
\tag{1}
\Lambda(T) \sim E(1)
\end{equation}\] to represent the Weibull regression model as an AFT model which is also a Cox model.
Now, \(\Lambda(T)=(T/\theta)^{\nu}\) and write \(Q=-\log(S(T))=\Lambda(T)=(T/\theta)^{\nu}\), where from (1), \(Q \sim E(1)\). That is \(\log(Q)=\nu(\log(T)-\log(\theta))\) can be expressed as
\[\begin{equation}
\tag{2}
\log(T)=\log(\theta)+ \tau \log(Q) := \log(\theta) + \tau \epsilon,
\end{equation}\] where \(\tau=1/\nu\) and it is easy to show that \(\epsilon=\log(Q)\) has the ``extreme value’’ distribution with density \(f_{\epsilon}(x)=\exp(x-e^{x})\) for \(x \in {\cal R}\). Here the range of \(\log(T) \in {\cal R}\) is un-restricted. The survReg routine uses the parameterization \((2)\) and therefore estimates \(\log(\theta)\) and \(\tau=1/\nu\).
To incorporate covariates \(L\) (say) we specify \[\lambda(t;L)=\big(\nu \theta^{-\nu}t^{\nu-1} \big) \exp(L'\beta)
:= \lambda_{0}(t)\exp(L'\beta),\] where \(\lambda_{0}(t)\) is the hazard, say, for \(T_{0} \sim \hbox{Weibull}(\nu,\theta)\). This is a special case of the proportional hazards model. The chf (conditional chf with covariate vector \(L\)) is \(\Lambda(t;Z)=(t/\theta)^{\nu}\exp(L'\beta)\) so that analogous to above this leads to the representation
\[\begin{equation}
\tag{3}
\log(T) =\log(\theta)+\tau[-L'\beta+\epsilon] =\log(\theta)+L'\gamma + \tau \epsilon,
\end{equation}\] where \(\gamma=-\tau\beta\), with \(\tau\) and \(\epsilon\) defined in (2). R survReg uses this AFT parameterization so that the estimated components of \(\gamma\), \(\gamma_{p}\) say, are that of \(-\tau\beta_{p}\) for \(p=1,\ldots,m\) (\(m\) is dimension of \(\beta\)).
When fitting the AFT model (3) via suvreg we therefore transform parameters \(\hat\gamma\) to the Weibull hazard-ratio parameterization (2) via
As an illustration we compare the survReg model fits for the case-study dataset. The following table below compares the Weibull survReg model fits with covariates Treat and Ecog1 (Ecog = 1 vs 0) where components of \(\hat\gamma\) from model (3) are calculated according to survReg and \(\hat\beta\) are formed via (4). In the table below Weibull estimates of \(\hat\beta\) are compared to Cox model versions.
# Comparing Weibull vs Cox with case-study where outcomes are artifical# This is just for illustration to show conversion of Weibull parameters from # AFT regression to Weibull hazard fit.weib_ex <-survreg(Surv(tte,pmax(event,1)) ~ treat + ecog1, dist='weibull', data=df.case)tauhat <- fit.weib_ex$scale# convert (treat,ecog1) regression parms to Weibull hazard-ratiobhat.weib <--(1)*coef(fit.weib_ex)[c(2,3)]/tauhat# Compare to Cox fit.cox_ex <-coxph(Surv(tte,pmax(event,1)) ~ treat + ecog1, data=df.case)res <-cbind(bhat.weib,coef(fit.cox_ex))res <-as.data.frame(res)colnames(res)<-c("Weibull","Cox")res |>gt() |>fmt_number(columns=1:2,decimals=6) |>tab_header(title="Comparing Weibull to Cox hazard ratio estimates",subtitle="Case-study dataset with artificial outcomes")
Comparing Weibull to Cox hazard ratio estimates
Case-study dataset with artificial outcomes
Weibull
Cox
0.157144
0.176033
0.472355
0.472196
2.2 Biomarker effects with spline model
We now outline how potential outcomes are simulated according to parameters fit to the case-study dataset but with parameters specified to induce biomarker effects. That is, causal treatment effects (on log(hazard-ratio) scale) that follow a spline model according to patterns where biomarker effects increase with biomarker levels; Including various degrees of limited treatment effects for low biomarker levels.
We first consider a Weibull model with treatment and a single biomarker covariate \(Z\) where we write the linear predictor of the Cox model \(L'\beta\) (say) as
Following the potential-outcome approach let \(l_{x,z}\) denote subject’s hazard-function “had they followed treatment regimen \(Treat=x\) while having biomarker level \(Z=z\)”. That is, for subject with biomarker level \(Z=z\) we can simulate their survival outcomes under both treatment (\(x=1\)) and control (\(x=0\)) conditions. Let \(\beta^{0} = (\beta_{1}^{0},\ldots,\beta_{5}^{0})'\) denote the true coefficients and denote the hazard function as
\[\begin{equation}
l_{x,z} = \beta^{0}_{1}x + \beta^{0}_{2}z + \beta^{0}_{3}zx + \beta^{0}_{4}(z-k)I(z>k) +
\beta^{0}_{5}(z-k)I(z>k)x,
\end{equation}\] the log of the hazard ratio for biomarker level \(z\) under treatment (\(x=1\)) relative to control (\(x=0\)) is given by
The log(hazard-ratio) for biomarker level \(z\) is a linear function of \(z\) with a change-point (in slope) at \(z=k\) given by
\[\begin{eqnarray*}
\psi^{0}(z) &=& \beta^{0}_{1} + \beta^{0}_{3}z, \quad \hbox{for} \ z \leq k, \cr
&=& \beta^{0}_{1} + \beta^{0}_{3}z + \beta^{0}_{5}(z-k), \quad \hbox{for} \ z > k.
\end{eqnarray*}\]
Log hazard-ratio parameters \((\beta^{0}_{1},\beta^{0}_{3},\beta^{0}_{5})\) can be chosen to generate “treatment effect patterns” by specifying \(\psi^{0}(z)\) values at \(z=0\), \(z=k\), and \(z=\zeta\) for \(\zeta > k\). For specified \(\psi^{0}(0)\), \(\psi^{0}(k)\), and \(\psi^{0}(\zeta)\) we have
The function get_dgm_stratified generates \(\psi^{0}(z)\) according to desired “biomarker treatment effect patterns” as follows.
Let \(X\) and \(Z\) denote the treatment and biomarker variables in the case-study dataset and for specified \(k\), form the covariates \(L:=(X,Z,ZX,(Z-k)I(Z>k),(Z-k)I(Z>k)X))\);
Fit the Weibull model (recall on AFT scale) to get \(\log(\hat\theta)\), \(\hat\tau=1/\hat{\nu}\), and \(\hat\gamma\) corresponding to \(L\);
\(\hat\gamma\) is in terms of the AFT parameterization given by model (3)
Next transform to the Weibull (Cox) log hazard-ratio parameterization (4): \(\hat\beta = -\hat\gamma/\hat\tau\)
Set “true” parameters \(\theta^{0}=\hat\theta\), and \(\tau^{0}=\hat\tau\)
Initialize the “true” parameter \(\beta^{0} = \hat\beta\) and re-define parameters 1, 3, and 5 in order to satisfy specified \(\psi^{0}(0)\), \(\psi^{0}(k)\), and \(\psi^{0}(\zeta)\): \(\beta^{0}[1] = \psi^{0}(0)\), \(\beta^{0}[3] = (\psi^{0}(k) - \beta^{0}[1])/k\), and \(\beta^{0}[5] = (\psi^{0}(\zeta) - \beta^{0}[1] - \beta^{0}[3]\zeta)/(\zeta -k)\);
Form corresponding \(\gamma^{0}= -\beta^{0}\tau^{0}\)
For simulations we use the AFT parameterization (3) to generate \(\log(T)\) outcomes according to \(\log(T) = \log(\theta^{0}) + L'\gamma^{0} + \tau^{0}\epsilon\) where recall \(\epsilon\) has the ``extreme value’’ distribution.
The inputs of the get_dgm_stratified function are:
The case-study dataset (“df”)
“knot”=\(k\), “zeta”=\(\zeta\), and “log.hrs”= (\(\psi^{0}(0),\psi^{0}(k),\psi^{0}(\zeta))\)
Note that get_dgm_stratified allows for outcomes to follow a stratified Weibull (Cox) models in which case the log(hazard-ratio) effects will depend on the strata, “strata_tte”, where the default is non-stratified (“strata_tte=NULL”)
If stratified the \(\tau\) parameters and hence \(\gamma^{0}\) depend on the strata
If stratified, then to simplify, the \(\gamma^{0}\) effects are calculated based on the median of the \(\tau^{0}\)’s (\(\tau^{0}=\tau_{med}\), say).
To-do –> explain outputs and how used in draw_sim_stratified …
2.4 Example where treatment effects increase with increasing biomarker
We assume that \(z=0\) indicates biomarker negative values with \(z>1\) indicating positive levels. As an example, suppose the log(hazard ratio) at \(z=0\) is \(\psi^{0}(0)=\log(3)\) and decreases linearly for \(z \leq 5\) (with slope \(\beta_{3}^{0}\)) such that at \(z=5\), \(\psi^{0}(5)=\log(1.25)\) and for \(z>5\) decreases linearly (with a change in slope and intercept) such that at \(z=10\), \(\psi^{0}(10) = \log(0.5)\). In the following we will describe summary measures of the treatment effects as a function of increasing [or decreasing] values of the biomarker. In this example treatment is detrimental for lower values of the biomarker with treatment effects increasing fairly quickly as the biomarker increases with an overall ``average hazard ratio’’ (AHR) of \(\approx 0.74\) (see Working Example below).
2.5 Including prognostic factors \(W\)
We extend the model to include a baseline prognostic factor \(W\) within \(L\) (effect parameter \(\beta^{0}_{w}\)) where
Note that defining \(m(1,z,w)\) [\(m(0,z,w)\))] as the conditional expected value of \(\log(T)\) with treatment set to experimental [control] given biomarker level \(Z=z\) and \(W=w\)
\[\psi^{0}(z,w) = \{m(0,z,w)-m(1,z,w)\}/\tau\] which corresponds to the difference in the (true) means of the potential outcomes under experimental and control conditions (\(\log(T[0,z,w]\) versus \(\log(T[1,z,w]\), say)
3 Specifying biomarker treatment effects
3.1 Average hazard ratios (AHRs)
We define the biomarker average hazard ratio (AHR) as the expected value of \(\psi^{0}(\cdot)\) across “increasing biomarker” sub-populations. For example, \(\hbox{AHR}(2^{+})\) represents the AHR for subjects with biomarker values \(\geq 2\) via
Here the expectations are over the distribution of \(Z\), however if there is a prognostic factor \(W\) then the expectations are over \((Z,W)\). In our calculations we calculate empirical averages.
3.2 Controlled direct-effect (CDE) versions (Aalen, Cook, and Røysland (2015))
Averaging across hazard ratios: Define the hazard (omitting baseline hazard) \(\theta^{x}(z,w) = \exp(l(x,z,w))\) setting treatment to \(X=x\) given \(Z=z\) and \(W=w\). Aalen et al. (Aalen, Cook, and Røysland (2015)) define the controlled direct-effect (CDE) as the ratio of the expected hazards. Here we consider the above cumulative versions (omitting possible dependence on \(W\)). Let \(\bar\theta^{x}(z+) = E_{Z \geq z}\theta^{x}(Z)\) for \(x=0,1\), and define
df.case <-within(df.case,{ z <-pmin(biomarker,20) # eg, PDL-1 expression truncated at 20 CTregimen <-ifelse(mAD_CTregimen1==0,"socA","socB") # Outcome stratification (Weibull/Cox model) })# These are confounders (fixed) that were characteristics from observed dataconfounders.name <-c("age","biomarker","male","EU_region","AP_region","histology","surgery","ecog1","mAD_CTregimen1")
3.3 Compare ‘AHR’ versions under various biomarker specifications
3.3.1 Under PH (uniform effects, hr=0.7), with strong prognostic effect \(\beta_{w}= -\log(5)\)
Code
# Outcome model is NON-stratified (strata_tte=NULL)# PH log.hrs <-log(c(0.7,0.7,0.7))dgm <-get_dgm_stratified(df=df.case,zeta=10,knot=5,log.hrs=log.hrs,strata_tte=NULL,details=FALSE)# Simulate with randomization factors "stratum" # Note that strata_tte="CTregimen" is a simplification (collapsing over region and metastatic) of "stratum"# xname="AP_region" specifies prognostic effect factor# bx=log(5) denotes log(hazard ratio) relative effect with respect to xname factor# Draw very large-sample to get approximation to bias# Note: return_df will NOT return the large simulated dataset but the "large-sample" estimates as a list# checking=TRUE will compare the estimated tau (stratified) parameters based on the case-study dataset to # the versions based on the simulated dataset# details=TRUE will output the approximations to Cox model estimators from several models ("population summaries")# return_df =FALSE will return the "population summaries" as a list as well as the biomarker AHR functional (see below)pop_summary1 <-draw_sim_stratified(dgm=dgm,ss=1,Ndraw=100000,wname="AP_region",bw=-log(5),strata_rand="stratum",checking=TRUE,details=TRUE,return_df=FALSE)
3.4 Calculating causal ‘AHR’ effects for subgroups
To evaluate the estimation properties of our procedures we calculate the empirical average within subgroup \(\cal G\) (say) which can represent the estimated subgroup or can be the ITT population (or any underlying subgroup). That is, for each subject with characteristics \({\bf L}=(Z,W)\) say, we have their individual log hazard-ratio contrast \(\psi^{0}({\bf L})\); We estimate \(\hbox{AHR}({\cal G}):= \exp\left\{E_{\cal G} \psi^{0}\right\}\) by calculating the empirical average
\[\begin{equation}
\tag{8}
\hbox{AHR}({\cal G}):= \exp(\bar\beta({\cal G})), \quad \hbox{where} \quad \bar\beta({\cal G}) =(1/n({\cal G}))\sum_{g \in {\cal G}} \psi^{0}({\bf L}_{g}),
\end{equation}\] with \(n({\cal G})\) denoting the size of \({\cal G}\).
4 Working example for subgroup identification strategies
In the remainder we will be evaluating the performance of subgroup identification procedures when the underlying biomarker profile (``differential treatment effect mechanism’’) is quite strong and challenges with establishing consistency within the AP region are induced by a strong prognostic effect (\(\beta_{w}=-\log(5)\)).
Note that in the following we draw a very large sample (\(n = 100,000\)) to approximate the asymptotic properties of Cox model analyses based on ITT and AP region subgroups (non-AP is denoted \(W=0\), and AP is \(W=1\)). The code outputs the overall AHR which represent (8) applied to the ITT population (i.e., \(\cal G\)=ITT) as well as (8) applied to non-AP and AP subgroups (“AHR W=1”, and “AHR W=0”, resp.).
Code
# Outcome model is NON-stratified (strata_tte=NULL)# Strong biomarker log.hrs <-log(c(3,1.25,0.50))dgm <-get_dgm_stratified(df=df.case,zeta=10,knot=5,log.hrs=log.hrs,strata_tte=NULL,details=FALSE)# Simulate with randomization factors "stratum" # Note that strata_tte="CTregimen" is a simplification (collapsing over region and metastatic) of "stratum"# xname="AP_region" specifies prognostic effect factor# bx=log(5) denotes log(hazard ratio) relative effect with respect to xname factor# Draw very large-sample to get approximation to bias# Note: return_df will NOT return the large simulated dataset but the "large-sample" estimates as a list# checking=TRUE will compare the estimated tau (stratified) parameters based on the case-study dataset to # the versions based on the simulated dataset# details=TRUE will output the approximations to Cox model estimators from several models ("population summaries")# return_df =FALSE will return the "population summaries" as a list as well as the biomarker AHR functional (see below)# Strong prognostic factor pop_summary <-draw_sim_stratified(dgm=dgm,ss=1,Ndraw=100000,wname="AP_region",bw=-log(5),strata_rand="stratum",checking=TRUE,details=TRUE,return_df=FALSE)
# An illustrative simulated example datasetdf_example <-draw_sim_stratified(dgm=dgm,ss=123,wname="AP_region",bw=-log(5),strata_rand="stratum")# Subgroups identified with nonAP populationdf_nonAP <-subset(df_example,AP_region==0)# Applied to APdf_AP <-subset(df_example,AP_region==1)df_ITT <- df_example# True causal AHRahr_true =c(round(pop_summary$AHR,4))# AHR Cox un-adjusted (only treatment)ahr_unadj =c(round(pop_summary$ITT_unadj,4))# Treatment plus randomization stratificaton sRahr_sR =c(round(pop_summary$ITT_sR,4))# sR including adjustment by xahr_sRw =c(round(pop_summary$ITT_sRw,4))# Relative biases bias_unadj <-round(100*c(pop_summary$ITT_unadj-pop_summary$AHR)/pop_summary$AHR,1)bias_sR <-round(100*c(pop_summary$ITT_sR-pop_summary$AHR)/pop_summary$AHR,1)bias_sRw <-round(100*c(pop_summary$ITT_sRw-pop_summary$AHR)/pop_summary$AHR,1)# The bias within X=1 populationahr_w1 =c(round(pop_summary$W_1,4))bias_w1 <-round(100*c(pop_summary$W_1-pop_summary$AHR)/pop_summary$AHR,1)
Code
plot_AHRs(popsummary=pop_summary,dfcase=df.case)
Overall Population (large-sample) –
The overall true (causal) average hazard-ratio (AHR) = 0.7368
The \(\approx\) asymptotic limit of un-adjusted Cox (only treatment) = 0.7904
The \(\approx\) asymptotic limit of stratified Cox (treatment and stratified by stratum) = 0.7641
The \(\approx\) asymptotic limit of stratified Cox with adjustment by \(x\) = 0.7458
The relative over-estimation biases for the Cox model estimates when un-adjusted, stratified (by randomization factors), and stratified + adjusted by the strong prognostic factor \(X\) are: 7.3%, 3.7%, and 1.2%, respectively.
4.1 Dilution of marginal Cox model within AP region
Moreover, within the AP_region (\(X=1\)) the \(\approx\) limit of the Cox model estimator based on the (AP_region) regional local data is 1.016 with corresponding bias of 37.9%.
We would therefore expect challenges with establishing consistency based on evaluating the local regional data.
Lets check if there is any imbalance (asymptotically) by biomarker by drawing a large sample (N=100,000) and viewing summary tables:
Code
# This time return large dataset (corresponding to that used in the above asymptotic approximations)dflarge <-draw_sim_stratified(dgm=dgm,ss=1,Ndraw=100000,wname="AP_region",bw=-log(5),strata_rand="stratum",checking=FALSE,details=FALSE,return_df=TRUE)# create factor versions of treatment and AP regiondflarge$trtsim <-ifelse(dflarge$treat.sim==1,"Experimental","Control")dflarge$apregion <-ifelse(dflarge$AP_region==1,"AP","non-AP")dflarge$bm_lt2 <-ifelse(dflarge$biomarker<2,"biomarker<2","biomarker>=2")table1( ~ biomarker + bm_lt2 | trtsim, data=dflarge, caption=c("ITT by treatment arm"))
ITT by treatment arm
Control (N=50001)
Experimental (N=49999)
Overall (N=100000)
biomarker
Mean (SD)
7.83 (7.06)
7.82 (7.07)
7.82 (7.07)
Median [Min, Max]
5.00 [0, 20.0]
5.00 [0, 20.0]
5.00 [0, 20.0]
bm_lt2
biomarker<2
11930 (23.9%)
11990 (24.0%)
23920 (23.9%)
biomarker>=2
38071 (76.1%)
38009 (76.0%)
76080 (76.1%)
Code
table1( ~ biomarker + bm_lt2 | apregion, data=dflarge, caption=c("ITT by AP region"))
ITT by AP region
AP (N=24472)
non-AP (N=75528)
Overall (N=100000)
biomarker
Mean (SD)
7.95 (6.87)
7.78 (7.13)
7.82 (7.07)
Median [Min, Max]
5.00 [1.00, 20.0]
5.00 [0, 20.0]
5.00 [0, 20.0]
bm_lt2
biomarker<2
4922 (20.1%)
18998 (25.2%)
23920 (23.9%)
biomarker>=2
19550 (79.9%)
56530 (74.8%)
76080 (76.1%)
Code
table1( ~ biomarker + bm_lt2 | trtsim, data=subset(dflarge,AP_region==1), caption=c("AP region by treatment arm"))
AP region by treatment arm
Control (N=12237)
Experimental (N=12235)
Overall (N=24472)
biomarker
Mean (SD)
7.96 (6.87)
7.94 (6.88)
7.95 (6.87)
Median [Min, Max]
5.00 [1.00, 20.0]
5.00 [1.00, 20.0]
5.00 [1.00, 20.0]
bm_lt2
biomarker<2
2442 (20.0%)
2480 (20.3%)
4922 (20.1%)
biomarker>=2
9795 (80.0%)
9755 (79.7%)
19550 (79.9%)
Code
rm("dflarge")
Next, a simulated example with same sample size as the case-study
Here we simulated a study according to the biomarker effects described above
Summarize the non-AP and AP populations
Apply subgroup identification procedures to the non-AP data and apply to AP
# This time return large dataset (corresponding to that used in the above asymptotic approximations)df <- df_example# create factor versions of treatment and AP regiondf$trtsim <-ifelse(df$treat.sim==1,"Experimental","Control")df$apregion <-ifelse(df$AP_region==1,"AP","non-AP")df$bm_lt2 <-ifelse(df$biomarker<2,"biomarker<2","biomarker>=2")table1( ~ biomarker + bm_lt2 | trtsim, data=df, caption=c("ITT by treatment arm"))
ITT by treatment arm
Control (N=255)
Experimental (N=252)
Overall (N=507)
biomarker
Mean (SD)
8.29 (7.45)
7.35 (6.63)
7.82 (7.06)
Median [Min, Max]
5.00 [0, 20.0]
5.00 [1.00, 20.0]
5.00 [0, 20.0]
bm_lt2
biomarker<2
59 (23.1%)
61 (24.2%)
120 (23.7%)
biomarker>=2
196 (76.9%)
191 (75.8%)
387 (76.3%)
Code
table1( ~ biomarker + bm_lt2 | apregion, data=df, caption=c("ITT by AP region"))
ITT by AP region
AP (N=125)
non-AP (N=382)
Overall (N=507)
biomarker
Mean (SD)
7.88 (6.87)
7.81 (7.13)
7.82 (7.06)
Median [Min, Max]
5.00 [1.00, 20.0]
5.00 [0, 20.0]
5.00 [0, 20.0]
bm_lt2
biomarker<2
25 (20.0%)
95 (24.9%)
120 (23.7%)
biomarker>=2
100 (80.0%)
287 (75.1%)
387 (76.3%)
Code
table1( ~ biomarker + bm_lt2 | trtsim, data=subset(df,AP_region==1), caption=c("AP region by treatment arm"))
AP region by treatment arm
Control (N=62)
Experimental (N=63)
Overall (N=125)
biomarker
Mean (SD)
8.42 (7.13)
7.35 (6.62)
7.88 (6.87)
Median [Min, Max]
5.00 [1.00, 20.0]
5.00 [1.00, 20.0]
5.00 [1.00, 20.0]
bm_lt2
biomarker<2
10 (16.1%)
15 (23.8%)
25 (20.0%)
biomarker>=2
52 (83.9%)
48 (76.2%)
100 (80.0%)
4.3 Looking at 9 simulated AP trials when AP region is a strong factor (\(\beta_{w} = -\log(5)\)) and null (\(\beta_{w}=0\))
4.3.1 Nine simulated AP realizations under \(\beta_{w}= -\log(5)\)
4.4 Compare non-parametric (cubic-spline) fit with true \(\psi^{0}\)
For ITT dataset
Code
# ydel (default=0.25) is how much room to provide in space for countstemp <-cox_cs_fit2(df=df_example,tte.name="y.sim",event.name="event.sim",treat.name="treat.sim",strata.name="strata.simO",z.name=c("z"), details=FALSE ,boots=0, xlab=c("Biomarker"),maxz =25, ydel=0.5, show_plot=TRUE,truebeta.name="loghr.po")title("ITT")
aa <-paste(confounders.name, collapse=" + ")aa <-paste0("~ " ,aa)aa <-paste0(aa," | treat_name")aa <-as.formula(aa)table1( aa , data=df_nonAP, caption=c("Simulated dataset: non-AP (training data)"))
Simulated dataset: non-AP (training data)
Control (N=189)
Experimental (N=193)
Overall (N=382)
age
Mean (SD)
59.2 (13.0)
60.0 (11.6)
59.6 (12.3)
Median [Min, Max]
62.0 [23.0, 87.0]
61.0 [22.0, 83.0]
61.0 [22.0, 87.0]
biomarker
Mean (SD)
7.34 (6.98)
8.26 (7.27)
7.81 (7.13)
Median [Min, Max]
5.00 [1.00, 20.0]
5.00 [0, 20.0]
5.00 [0, 20.0]
male
Mean (SD)
0.730 (0.445)
0.751 (0.433)
0.741 (0.439)
Median [Min, Max]
1.00 [0, 1.00]
1.00 [0, 1.00]
1.00 [0, 1.00]
EU_region
Mean (SD)
0.481 (0.501)
0.420 (0.495)
0.450 (0.498)
Median [Min, Max]
0 [0, 1.00]
0 [0, 1.00]
0 [0, 1.00]
AP_region
Mean (SD)
0 (0)
0 (0)
0 (0)
Median [Min, Max]
0 [0, 0]
0 [0, 0]
0 [0, 0]
histology
Mean (SD)
0.344 (0.476)
0.404 (0.492)
0.374 (0.485)
Median [Min, Max]
0 [0, 1.00]
0 [0, 1.00]
0 [0, 1.00]
surgery
Mean (SD)
0.233 (0.424)
0.264 (0.442)
0.249 (0.433)
Median [Min, Max]
0 [0, 1.00]
0 [0, 1.00]
0 [0, 1.00]
ecog1
Mean (SD)
0.587 (0.494)
0.611 (0.489)
0.599 (0.491)
Median [Min, Max]
1.00 [0, 1.00]
1.00 [0, 1.00]
1.00 [0, 1.00]
mAD_CTregimen1
Mean (SD)
0.476 (0.501)
0.487 (0.501)
0.482 (0.500)
Median [Min, Max]
0 [0, 1.00]
0 [0, 1.00]
0 [0, 1.00]
5 Subgroup identification criteria: Options for targeting subgroup effects
5.1 Finding Non-AP subgroup with highest consistency rate (in favor of control)
In the following we evaluate subgroups based on single-factors (e.g., “biomarker < J” say, or “Age <= 65”, etc) and select the subgroup with the highest consistency rate with Cox hazard-ratio estimate meeting selection criteria; The selected subgroup with the highest consistency rate meeting hazard ratio estimate criterion (log hazard-ratio estimate \(\log(\hat\beta) \geq \log(0.90)\), say) where the “consistency rate” is at least \(90\)%.
Here we are seeking to identify subgroups where the hazard ratio estimate is at least \(\log(0.90)\) which may indicate limited benefit. Further, subgroup candidates are ranked according to highest hazard ratio estimate and highest consistency rate.
This is an option, and we discuss alternatives below
To do —> Refer criterion (hazard ratio thresholds) to equations in paper León et al. (2024)
Code
# Show the first (up to) 10 subgroups ("showten_subgroups=TRUE") meeting identification criteria # Subgroups based on only single-factors ("maxk=1")# Note that sg_focus= "hr" sorts the subgroups by consistency rate and largest hazard-ratio estimates # Selects the subgroup with the largest consistency rate; If there are ties among consistency rates, then # the subgroup with largest HR estimate is selected estimate satisfying the (minimum) consisency criteria # (pconsistency.threshold)# Note: we exclude cuts for biomarker which are "<=" in order to only consider "<" fs <-forestsearch(df.analysis=df_nonAP, df.test=df_AP, confounders.name=confounders.name,outcome.name="y.sim",treat.name="treat.sim", event.name="event.sim",potentialOutcome.name="loghr.po",hr.threshold =0.90, hr.consistency =0.80, pconsistency.threshold =0.90,sg_focus ="hr",showten_subgroups=TRUE, details=TRUE,conf_force=c("age <= 65","biomarker < 1","biomarker < 2","biomarker < 3","biomarker < 4","biomarker < 5","biomarker < 6","biomarker < 7","biomarker < 8","biomarker < 9","biomarker < 10"), exclude_cuts =c("biomarker <="),maxk=1, plot.sg=FALSE)if(output) save(fs,file="output/sim3_k1tenhr.Rdata")
Display the first 10 subgroups and Kaplan-Meier plots within the estimated subgroups –
Notice that the first 10 subgroups are sorted by decreasing HR estimates and the selected subgroup corresponds to the highest hazard ratio estimate.
Code
if(loadit) load(file="output/sim3_k1tenhr.Rdata")# Consistency for subgroups contained in "grp.consistency"sg_tables(fs=fs,which_df="est",est_caption="Non-AP (training) estimates")$sg10_out
Subgroups formed by single-factors
maxk=1
M.1
N
E
hr
Pcons
{biomarker < 5}
149
148
1.658330
1.000000
{biomarker < 4}
148
147
1.643559
1.000000
{biomarker < 3}
135
134
1.584584
1.000000
{biomarker < 6}
219
218
1.570647
1.000000
{biomarker < 7}
235
233
1.563982
1.000000
{biomarker < 8}
238
236
1.537168
1.000000
{biomarker < 10}
241
239
1.456963
1.000000
{biomarker < 2}
95
94
1.655311
0.998000
Code
#sg_tables(fs=fs,which_df="est",est_caption="Non-AP (training) estimates",potentialOutcome.name="loghr.po")$tab_estimates# Show detailed estimates for the next analysisid_harm <-paste(fs$sg.harm,collapse=" & ")
The selected subgroup is {biomarker < 5}
Next plot the K-M curves within the estimated non-AP (training) subgroups.
if(loadit) load(file="output/sim3_k1tenmin.Rdata")# Training summariesSG_est <-sg_tables(fs=fs,which_df="est",est_caption="Non-AP (training) estimates",potentialOutcome.name="loghr.po")id_harm <-paste(fs$sg.harm,collapse=" & ")
Top 10 (up to 10) subgroups meeting criteria:
Code
# Top 10 (up to 10) SG'sSG_est$sg10_out
Subgroups formed by single-factors
maxk=1
M.1
N
E
hr
Pcons
{biomarker < 2}
95
94
1.655311
0.998000
{biomarker < 3}
135
134
1.584584
1.000000
{biomarker < 4}
148
147
1.643559
1.000000
{biomarker < 5}
149
148
1.658330
1.000000
{biomarker < 6}
219
218
1.570647
1.000000
{biomarker < 7}
235
233
1.563982
1.000000
{biomarker < 8}
238
236
1.537168
1.000000
{biomarker < 10}
241
239
1.456963
1.000000
Estimates within training (un-adjusted). Note that “AHR(po)” denotes (we refer to the log hazard ratio contrast as “potential-outcome”) denotes equation (8) applied to the respective subgroups in the table (e.g., \({\cal G}\)=ITT).
Code
SG_est$tab_estimates
Non-AP (training) estimates
Subgroup
n
n1
events
m1
m0
RMST
HR (95% CI)
AHR(po)
ITT
382 (100%)
189 (49%)
343 (90%)
11.8
10.7
4.72 (1.58,7.86)
0.68 (0.55,0.84)
0.74
Questionable
95 (25%)
46 (48%)
94 (99%)
6.2
8.2
-3.77 (-7.14,-0.39)
1.66 (1.09,2.52)
2.52
Recommend
287 (75%)
143 (50%)
249 (87%)
16
11.8
7.28 (3.68,10.87)
0.55 (0.42,0.71)
0.49
The selected subgroup is {biomarker < 2}
Next plot the K-M curves within the estimated non-AP subgroups.
Now, if we are only concerned with finding the selected subgroup then the above calculations can be much faster by removing the option to output the first 10 subgroups meeting the selection criteria (“showten_subgroups=FALSE” is the default). This is the default and strongly recommended if running simulations or if there is interest in calculating adjusted hazard-ratio estimates for the selected subgroup. However it is crucial to note that in our applications we are using the non-AP data for subgroup identification and interested in the estimates based on the AP local data. Therefore the non-AP data represents the training dataset and the AP data represents the testing dataset where such adjustment is not necessary (that is, the AP data is not utilized for the subgroup selection).
Here we turn-off showing the first ten subgroups (default in forestsearch) and apply the estimated subgroups to the AP data.
Code
# df.test = df_AP indicates that the selected subgroup identification flags # (based on df_nonAP analysis) will be appended to df_AP# Speed is increased by first sorting by subgroup size and once a subgroup # meets the consistency criterion the overall criteria is met and the algorithm stopsfs <-forestsearch(df.analysis=df_nonAP, df.test=df_AP, confounders.name=confounders.name,outcome.name="y.sim", treat.name="treat.sim", event.name="event.sim",potentialOutcome.name="loghr.po",hr.threshold =0.90, hr.consistency =0.80, pconsistency.threshold =0.90,sg_focus="minSG", details=TRUE,conf_force=c("age <= 65","biomarker < 1","biomarker < 2","biomarker < 3","biomarker < 4","biomarker < 5","biomarker < 6","biomarker < 7","biomarker < 8","biomarker < 9","biomarker < 10"), exclude_cuts=c("biomarker <="), maxk=1)
# of continuous/categorical characteristics 2 7
Continuous characteristics: age biomarker
Categorical characteristics: male EU_region AP_region histology surgery ecog1 mAD_CTregimen1
Default cuts included (1st 20) age <= mean(age) age <= median(age) age <= qlow(age) age <= qhigh(age) biomarker <= mean(biomarker) biomarker <= median(biomarker) biomarker <= qlow(biomarker) biomarker <= qhigh(biomarker)
Categorical: male EU_region AP_region histology surgery ecog1 mAD_CTregimen1
Dropping variables (only 1 level): AP_region
# of candidate subgroup factors= 21
[1] "age <= 65" "biomarker < 1" "biomarker < 2" "biomarker < 3"
[5] "biomarker < 4" "biomarker < 5" "biomarker < 6" "biomarker < 7"
[9] "biomarker < 8" "biomarker < 9" "biomarker < 10" "age <= 59.6"
[13] "age <= 61" "age <= 52" "age <= 68" "male"
[17] "EU_region" "histology" "surgery" "ecog1"
[21] "mAD_CTregimen1"
Number of factors evaluated= 21
Confounders per grf screening q9 q8 q11 q10 q7 q21 q17 q18 q20 q13 q12 q14 q1 q6 q3 q15 q4 q19 q16 q5 q2
Factors Labels VI(grf)
9 biomarker < 8 q9 0.2347
8 biomarker < 7 q8 0.1637
11 biomarker < 10 q11 0.1044
10 biomarker < 9 q10 0.1021
7 biomarker < 6 q7 0.0807
21 mAD_CTregimen1 q21 0.0408
17 EU_region q17 0.0370
18 histology q18 0.0329
20 ecog1 q20 0.0316
13 age <= 61 q13 0.0261
12 age <= 59.6 q12 0.0251
14 age <= 52 q14 0.0195
1 age <= 65 q1 0.0182
6 biomarker < 5 q6 0.0152
3 biomarker < 2 q3 0.0149
15 age <= 68 q15 0.0127
4 biomarker < 3 q4 0.0110
19 surgery q19 0.0109
16 male q16 0.0094
5 biomarker < 4 q5 0.0089
2 biomarker < 1 q2 0.0000
Number of possible configurations (<= maxk): maxk, # <= maxk 1 42
Approximately 5% of max_count met: minutes 8.333333e-05
Approximately 10% of max_count met: minutes 0.0001166667
Approximately 20% of max_count met: minutes 0.0001833333
Approximately 33% of max_count met: minutes 0.0002666667
Approximately 50% of max_count met: minutes 0.00035
Approximately 75% of max_count met: minutes 0.0005166667
Approximately 90% of max_count met: minutes 6e-04
# of subgroups evaluated based on (up to) maxk-factor combinations 42
% of all-possible combinations (<= maxk) 100
k.max= 1
Events criteria for control,exp= 10 10
# of subgroups with events less than criteria: control, experimental 1 1
# of subgroups with sample size less than criteria 1
# of subgroups meeting all criteria = 40
# of subgroups fitted (Cox model estimable) = 40
*Subgroup Searching Minutes=* 0.00065
Number of subgroups meeting HR threshold 10
Subgroup candidate(s) found (FS)
# of candidate subgroups (meeting HR criteria) = 10
SGs (1st 10) meeting screening thresholds sorted by sg_focus= minSG
Kmet (found), Subgroup, % Consistency= 0 !{age <= 68} 0.75
Consistency met!
# of splits, Kmet= 1000 1
Subgroup, % Consistency Met= {biomarker < 2} 0.998
SG focus= minSG
Subgroup Consistency Minutes= 0.042
Subgroup found (FS)
Minutes forestsearch overall= 0.04796667
5.3 Finding the largest non-AP subgroup with strong benefit
Here we are directly targeting the largest subgroup with a strong benefit. This is done by simply reversing the role of treatment so that we seek the largest subgroup for which the hazard ratio is below a clinical threshold (e.g., \(\leq 0.6\)).
if(loadit) load("output/sim3_k1min_null.Rdata")if(is.null(fs$sg.harm)) cat("Subgroup NOT found","\n")
Subgroup NOT found
We have been focusing on identifying subgroups based only on single factors, such as biomarker cuts, or other factors. However, in theory, subgroups effects could involve more than a single factor (In our forestsearch algorithm this is set by maxk=1).
For illustration we now consider selecting the smallest subgroup among 2-factor combinations (maxk=2).
Next we setup simulations to evaluate ability to identify subgroups based on non-AP data and the accuracy to estimate treatment effects in the AP data.
The simulation function mrct_APregion_sims simulates (n_sims times) from specified data-generating-model (dgm):
ITT sample is drawn with subgroups identified (as above)
Cox estimates based on the AP local data;
Subgroups identified based on non-AP and applied to AP
For identified subgroups the Cox model estimates are created (hr_sg)
Note that if a subgroup is NOT found then “subgroups” are taken as the entire AP population
That is, if a subgroup is NOT identified then we interpret the hr_sg as the reported AP estimates which would be the overall AP (since no subgroup is found)
In tables below we also summarize hr_sg_null where we define hr_sg to be missing if no subgroup is found
This represents the AP subgroup estimates conditional on a subgroup being identified
Code
# Simulation function to automatemrct_APregion_sims <-function(dgm,n_sims,wname="AP_region",bw=-log(5),sg_focus="minSG",maxk=1, hr.threshold=0.90, hr.consistency=0.80, pconsistency.threshold=0.9,details=FALSE){t.start<-proc.time()[3]# message for backends# borrowed from simtrials (sim_fixed_n)if (!is(plan(), "sequential")) {# future backendmessage("Using ", nbrOfWorkers(), " cores with backend ", attr(plan("list")[[1]], "class")[2]) } elseif (foreach::getDoParWorkers() >1) {message("Using ", foreach::getDoParWorkers(), " cores with backend ", foreach::getDoParName())message("Warning: ")message("doFuture may exhibit suboptimal performance when using a doParallel backend.") } else {message("Backend uses sequential processing.") }results <-foreach(sim =seq_len(n_sims),.options.future=list(seed=TRUE),.combine="rbind", .errorhandling="pass" ) %dofuture% {# simulate datadfs <-draw_sim_stratified(dgm=dgm,ss=sim,wname=wname,bw=bw,strata_rand="stratum")# Subgroups identified with nonAP populationdf_nonAP <-subset(dfs,AP_region==0)# Applied to APdf_AP <-subset(dfs,AP_region==1)# For subgroup identified estimates # Return NA if not estimable# First, AP results (does not require subgroup identification)# Initialize to missingn_test <-c(nrow(df_AP))n_train <-c(nrow(df_nonAP))n_itt <-c(nrow(dfs))fit <-summary(coxph(Surv(y.sim,event.sim) ~ treat.sim, data=df_nonAP))$conf.inthr_train <-c(fit[1])rm("fit")# Overall (ITT) standard stratified (by randomization strata)fit <-summary(coxph(Surv(y.sim,event.sim) ~ treat.sim +strata(strata.simR), data=dfs))$conf.inthr_itt <-c(fit[1])rm("fit")# Overall (ITT) standard stratified (by X)fit <-summary(coxph(Surv(y.sim,event.sim) ~ treat.sim +strata(AP_region), data=dfs))$conf.inthr_ittX <-c(fit[1])rm("fit")hr_test<-NAany_found <-0.0sg_found <-"none"n_sg <- n_testhr_sg <-NAprev_sg <-1.0# Restrict to AP estimablefitAP <-try(summary(coxph(Surv(y.sim,event.sim) ~ treat.sim, data=df_AP))$conf.int,TRUE)if(!inherits(fitAP,"try-error")){hr_test <-c(fitAP[1])rm("fitAP")# For identified subgroups?# initialize to testingany_found <-0.0sg_found <-"none"n_sg <- n_test prev_sg <-1.0# Prevalence of AP: n_sg/n_allhr_sg <-NA# Potential outcomePOhr_sg <-exp(mean(df_AP$loghr.po))# Note: if not found then n_sg=n_all, hr_sg=hr_all fs_temp <-try(forestsearch(df.analysis=df_nonAP, df.test=df_AP, confounders.name=confounders.name,outcome.name="y.sim", treat.name="treat.sim", event.name="event.sim",potentialOutcome.name="loghr.po",hr.threshold = hr.threshold, hr.consistency = hr.consistency, pconsistency.threshold = pconsistency.threshold,sg_focus=sg_focus, conf_force=c("age <= 65","biomarker < 1","biomarker < 2","biomarker < 3","biomarker < 4","biomarker < 5","biomarker < 6","biomarker < 7","biomarker < 8","biomarker < 9","biomarker < 10"), exclude_cuts =c("biomarker <="), maxk=maxk),TRUE)# FS with error output NA's (set above)# FS without errorif(!inherits(fs_temp,"try-error")){# FS done (w/o error) but NO sg found (replace NAs above with what is calculated)if(is.null(fs_temp$sg.harm)){# consider "sg = AP" any_found <-0.0sg_found <-"none"n_sg <- n_testhr_sg <- hr_testPOhr_sg <-exp(mean(df_AP$loghr.po))prev_sg <-1.0}# If sg foundif(!is.null(fs_temp$sg.harm)){any_found <-1.0df_test <- fs_temp$df.test sg_found <-c(paste(fs_temp$sg.harm,collapse=" & "))if(details & sim <=10) cat("Simulation subgroup found:",c(sim,sg_found),"\n") df_sg <-subset(df_test,treat.recommend==1)n_sg <-c(nrow(df_sg))prev_sg <-c(n_sg/n_test)# Subgroup analysisfitSG <-try(summary(coxph(Surv(y.sim,event.sim)~treat.sim,data=df_sg))$conf.int,TRUE)if(!inherits(fitSG,"try_error") &is.numeric(fitSG[1])) hr_sg <-c(fitSG[1])if(inherits(fitSG,"try_error") |!is.numeric(fitSG[1])) hr_sg <-NAloghr.po <- df_sg$loghr.poPOhr_sg <-exp(mean(loghr.po))rm("fitSG")} } # FS without error donerm("fs_temp")} # AP estimable in first place# Resultsdfres <- data.table::data.table(n_itt,hr_itt,hr_ittX,n_train,hr_train,n_test,hr_test,any_found,sg_found,n_sg,hr_sg,POhr_sg,prev_sg)return(dfres) } t.now<-proc.time()[3] t.min<-(t.now-t.start)/60if(details){cat("Minutes for simulations",c(n_sims,t.min),"\n")cat("Projection per 1000",c(t.min*(1000/n_sims)),"\n")cat("Propn subgroups found =",c(sum(!is.na(results$any_found))/n_sims),"\n") }# Define hr_sg_null as missing if any_found = 0results$hr_sg_null <- results$hr_sgresults$hr_sg_null[results$any_found==0] <-NAreturn(results) }summaryout_mrct <-function(pop_summary,mrct_sims, out_sgs =c("sg_found","sg_biomarker","sg_age"),out_sgs2 =c("sg_biomarker","sg_age","sg_male","sg_ecog1","sg_histology","sg_CTregimen","sg_EU","sg_surgery"),out_est =c("+ reldelta + APflag + sg_le85 + APflag2 + APflag3 + hr_sg_null"),sg_type=1,tab_caption =c("Identified subgroups and estimation summaries under biomarker effects: 1-factor combinations"), showtable=TRUE){outwhat1 <-c("~ hr_itt + hr_ittX + hr_test + found + prev_sg + hr_sg + POhr_sg +")if(sg_type==1) outwhat2 <-paste(out_sgs, collapse="+")if(sg_type==2) outwhat2 <-paste(out_sgs2, collapse="+")out_what <-as.formula(paste(outwhat1,outwhat2,out_est))res <-list()# True causal AHRres$ahr_true =c(round(pop_summary$AHR,4))# AHR Cox un-adjusted (only treatment)res$ahr_unadj =c(round(pop_summary$ITT_unadj,4))# Treatment plus randomization stratificaton sRres$ahr_sR =c(round(pop_summary$ITT_sR,4))# adjustment by wres$ahr_sRw =c(round(pop_summary$ITT_sRw,4))# Relative biases res$bias_unadj <-round(100*c(pop_summary$ITT_unadj-pop_summary$AHR)/pop_summary$AHR,1)res$bias_sR <-round(100*c(pop_summary$ITT_sR-pop_summary$AHR)/pop_summary$AHR,1)res$bias_sRw <-round(100*c(pop_summary$ITT_sRw-pop_summary$AHR)/pop_summary$AHR,1)# The bias within W=1 populationres$ahr_w1 =c(round(pop_summary$W_1,4))res$bias_w1 <-round(100*c(pop_summary$W_1-pop_summary$AHR)/pop_summary$AHR,1)mrct_sims$APflag <-ifelse(mrct_sims$hr_test >0.9,"AP > 0.9","AP <=0.9")mrct_sims$sg_le85 <-ifelse(mrct_sims$hr_sg <=0.85,"AP(sg)<=0.85","AP(sg)>0.85")mrct_sims$APflag2 <-ifelse(mrct_sims$hr_test >0.9& mrct_sims$hr_sg <=0.85,"AP > 0.9 & AP(sg) <= 0.85","Not")mrct_sims$APflag3 <-ifelse(mrct_sims$hr_test >0.9& mrct_sims$hr_sg <=0.80,"AP > 0.9 & AP(sg) <= 0.80","Not")mrct_sims$reldelta <-100*c(mrct_sims$hr_test - mrct_sims$hr_sg)/mrct_sims$hr_testmrct_sims$found <-as.factor(mrct_sims$any_found)# Classify by specific factors identifiedmrct_sims$sg_biomarker <-ifelse(grepl("biomarker",mrct_sims$sg_found),"biomarker",ifelse(mrct_sims$sg_found!="none","other","none"))mrct_sims$sg_age <-ifelse(grepl("age",mrct_sims$sg_found),"age",ifelse(mrct_sims$sg_found!="none","other","none"))mrct_sims$sg_ecog1 <-ifelse(grepl("ecog1",mrct_sims$sg_found),"ecog1",ifelse(mrct_sims$sg_found!="none","other","none"))mrct_sims$sg_histology <-ifelse(grepl("histology",mrct_sims$sg_found),"histology",ifelse(mrct_sims$sg_found!="none","other","none"))mrct_sims$sg_CTregimen <-ifelse(grepl("mAD_CTregimen1",mrct_sims$sg_found),"mAD_CT",ifelse(mrct_sims$sg_found!="none","other","none"))mrct_sims$sg_male <-ifelse(grepl("male",mrct_sims$sg_found),"male",ifelse(mrct_sims$sg_found!="none","other","none"))mrct_sims$sg_surgery <-ifelse(grepl("surgery",mrct_sims$sg_found),"surgery",ifelse(mrct_sims$sg_found!="none","other","none"))mrct_sims$sg_EU <-ifelse(grepl("EU_region",mrct_sims$sg_found),"EU region",ifelse(mrct_sims$sg_found!="none","other","none"))out_table <-table1(out_what, data=mrct_sims, caption=tab_caption)if(showtable) print(out_table)return(list(res=res,out_table=out_table))}SGplot_estimates <-function(df,label_training="Training",label_testing="Testing", label_itt="ITT (sR)", label_sg="Testing(subgroup)"){df_itt <-data.table(est=df$hr_itt, analysis=label_itt)df_training <-data.table(est=df$hr_train, analysis=label_training)df_testing <-data.table(est=df$hr_test, analysis=label_testing)df_sg <-data.table(est=df$hr_sg, analysis=label_sg)hr_estimates <-rbind(df_itt, df_training, df_testing, df_sg) est_order <-c(label_itt,label_training,label_testing,label_sg)hr_estimates <-within(hr_estimates, {analysis <-factor(analysis,levels=est_order)})p <-ggplot(hr_estimates, aes(x=analysis, y=est, fill=analysis)) +geom_violin(trim=FALSE)plot_estimates <- p +geom_boxplot(width=0.1)return(list(dfPlot_estimates=hr_estimates, plot_estimates=plot_estimates))}
# Run sims and store resultst.start<-proc.time()[3]# DGM #log.hrs <- log(c(2.5,1.25,0.50))# example 2log.hrs <-log(c(3,1.25,0.50))dgm <-get_dgm_stratified(df=df.case,zeta=10,knot=5,log.hrs=log.hrs,strata_tte=NULL,details=TRUE)pop_summary <-draw_sim_stratified(dgm=dgm,ss=1,Ndraw=100000,wname="AP_region",bw=-log(5),strata_rand="stratum",checking=FALSE,details=FALSE,return_df=FALSE)# True causal AHRahr_true =c(round(pop_summary$AHR,4))# AHR Cox un-adjusted (only treatment)ahr_unadj =c(round(pop_summary$ITT_unadj,4))# Treatment plus randomization stratificaton sRahr_sR =c(round(pop_summary$ITT_sR,4))# sR including adjustment by xahr_sRw =c(round(pop_summary$ITT_sRw,4))# Relative biases bias_unadj <-round(100*c(pop_summary$ITT_unadj-pop_summary$AHR)/pop_summary$AHR,1)bias_sR <-round(100*c(pop_summary$ITT_sR-pop_summary$AHR)/pop_summary$AHR,1)bias_sRw <-round(100*c(pop_summary$ITT_sRw-pop_summary$AHR)/pop_summary$AHR,1)# The bias within W=1 populationahr_w1 =c(round(pop_summary$W_1,4))bias_w1 <-round(100*c(pop_summary$W_1-pop_summary$AHR)/pop_summary$AHR,1)cat("AHR true (causal) and within AP:",c(ahr_true,ahr_w1),"\n")n_sims <- nsimslibrary(doFuture)library(doRNG)registerDoFuture()registerDoRNG()mrct_sims <-mrct_APregion_sims(dgm=dgm,n_sims=n_sims,wname="AP_region",bw=-log(5),sg_focus="minSG",details=TRUE)t.now<-proc.time()[3]t.min<-(t.now-t.start)/60cat("Minutes (doFuture) simulations",c(t.min),"\n")cat("How often was a subgroup identified=",c(mean(mrct_sims$any_found)),"\n")if(output) save(n_sims,pop_summary,dgm,mrct_sims,file="output/mrct_sims3_v0.Rdata")
Code
if(loadit) load(file="output/mrct_sims3_v0.Rdata")# Look at large-sample spline in non-APdflarge <-draw_sim_stratified(dgm=dgm,ss=1,Ndraw=10000,wname="AP_region",bw=-log(5),strata_rand="stratum",checking=FALSE,details=FALSE,return_df=TRUE)temp <-cox_cs_fit(df=subset(dflarge,AP_region==0),tte.name="y.sim",event.name="event.sim",treat.name="treat.sim",strata.name="strata.simO",z.name=c("z"), details=FALSE ,boots=0, xlab=c("Biomarker"),maxz =25, ydel=0.5, cex_legend=0.5, show_plot=TRUE,truebeta.name="loghr.po",main_title=c("Non-AP large sample ~ training estimation properties"))
6.2 Null model where benefit is uniform (prognostic effect bw=log5)
Simulations under the null where treatment benefit is uniform; Treatment does not depend on any baseline factors (hazard-ratio is 0.70)
Code
# Increase simulations to 5,000 under "null"t.start<-proc.time()[3]# Null uniform benefit hr=0.7log.hrs <-log(c(0.7,0.7,0.7))dgm <-get_dgm_stratified(df=df.case,zeta=10,knot=5,log.hrs=log.hrs,strata_tte=NULL,details=TRUE)pop_summary <-draw_sim_stratified(dgm=dgm,ss=1,Ndraw=100000,wname="AP_region",bw=-log(5),strata_rand="stratum",checking=FALSE,details=FALSE,return_df=FALSE)# True causal AHRahr_true =c(round(pop_summary$AHR,4))# AHR Cox un-adjusted (only treatment)ahr_unadj =c(round(pop_summary$ITT_unadj,4))# Treatment plus randomization stratificaton sRahr_sR =c(round(pop_summary$ITT_sR,4))# sR including adjustment by xahr_sRw =c(round(pop_summary$ITT_sRw,4))# Relative biases bias_unadj <-round(100*c(pop_summary$ITT_unadj-pop_summary$AHR)/pop_summary$AHR,1)bias_sR <-round(100*c(pop_summary$ITT_sR-pop_summary$AHR)/pop_summary$AHR,1)bias_sRw <-round(100*c(pop_summary$ITT_sRw-pop_summary$AHR)/pop_summary$AHR,1)# The bias within X=1 populationahr_w1 =c(round(pop_summary$W_1,4))bias_w1 <-round(100*c(pop_summary$W_1-pop_summary$AHR)/pop_summary$AHR,1)cat("AHR true (causal) and within AP:",c(ahr_true,ahr_w1),"\n")n_sims <- nsimslibrary(doFuture)library(doRNG)registerDoFuture()registerDoRNG()mrct_sims <-mrct_APregion_sims(dgm=dgm,n_sims=n_sims,wname="AP_region",bw=-log(5),sg_focus="minSG",details=TRUE)t.now<-proc.time()[3]t.min<-(t.now-t.start)/60cat("Minutes (doFuture) simulations",c(t.min),"\n")cat("How often was a subgroup identified=",c(mean(mrct_sims$any_found)),"\n")if(output) save(n_sims,pop_summary,dgm,mrct_sims,file="output/mrct_sims03_v0.Rdata")
Code
if(loadit) load(file="output/mrct_sims03_v0.Rdata")false_pos <-round(100*c(mean(mrct_sims$any_found)),4)# Look at large-sample spline in non-APdflarge <-draw_sim_stratified(dgm=dgm,ss=1,Ndraw=10000,wname="AP_region",bw=-log(5),strata_rand="stratum",checking=FALSE,details=FALSE,return_df=TRUE)temp <-cox_cs_fit(df=subset(dflarge,AP_region==0),tte.name="y.sim",event.name="event.sim",treat.name="treat.sim",strata.name="strata.simO",z.name=c("z"), details=FALSE ,boots=0, xlab=c("Biomarker"),maxz =25, ydel=0.5, cex_legend=0.5, show_plot=TRUE,truebeta.name="loghr.po",main_title=c("Non-AP large sample ~ training estimation properties"))
Code
# Note, here too many individual subgroup cuts, set sg_type=2temp <-summaryout_mrct(pop_summary=pop_summary, mrct_sims=mrct_sims,showtable=FALSE, sg_type=2, tab_caption=c("Identified subgroups and estimation summaries under uniform treatment effects: 1-factor combinations"))ahr_true <-c(temp$res$ahr_true)ahr_unadj <-c(temp$res$ahr_unadj)ahr_sR <-c(temp$res$ahr_sR)ahr_sRw <-c(temp$res$ahr_sRw)ahr_w1 <-c(temp$res$ahr_w1)
The underlying true ITT causal hazard ratio is 0.7
The large-sample (single draw) approximation to ITT un-adjusted is 0.7619
The large-sample ~ to ITT stratified is 0.706
The large-sample ~ to ITT stratified + x-adjustment is 0.7056
The large-sample ~ to AP region analysis is 0.7173
The false-positive rate (identifying a subgroup under uniform benefit)
Across 1,000 simulations (a subgroup was falsely identified) = 9.9%
Summary of subgroups identified and estimation properties are:
Code
# Show the tabletemp$out_table
Identified subgroups and estimation summaries under uniform treatment effects: 1-factor combinations
6.3 Null model where benefit is uniform (prognostic effect bw=-log5): maxk=2
Simulations under the null where treatment benefit is uniform (maxk=2); Treatment does not depend on any baseline factors (hazard-ratio is 0.7)
Note: Allowing for 2-factor combinations may require more stringent hazard-ratio thresholds; Here we increase to 1.25 (1.0) as in paper
Code
t.start<-proc.time()[3]# Null uniform benefit hr=0.7log.hrs <-log(c(0.7,0.7,0.7))dgm <-get_dgm_stratified(df=df.case,zeta=10,knot=5,log.hrs=log.hrs,strata_tte=NULL,details=TRUE)pop_summary <-draw_sim_stratified(dgm=dgm,ss=1,Ndraw=100000,wname="AP_region",bw=-log(5),strata_rand="stratum",checking=FALSE,details=FALSE,return_df=FALSE)# True causal AHRahr_true =c(round(pop_summary$AHR,4))# AHR Cox un-adjusted (only treatment)ahr_unadj =c(round(pop_summary$ITT_unadj,4))# Treatment plus randomization stratificaton sRahr_sR =c(round(pop_summary$ITT_sR,4))# sR including adjustment by xahr_sRw =c(round(pop_summary$ITT_sRw,4))# Relative biases bias_unadj <-round(100*c(pop_summary$ITT_unadj-pop_summary$AHR)/pop_summary$AHR,1)bias_sR <-round(100*c(pop_summary$ITT_sR-pop_summary$AHR)/pop_summary$AHR,1)bias_sRw <-round(100*c(pop_summary$ITT_sRw-pop_summary$AHR)/pop_summary$AHR,1)# The bias within X=1 populationahr_w1 =c(round(pop_summary$W_1,4))bias_w1 <-round(100*c(pop_summary$W_1-pop_summary$AHR)/pop_summary$AHR,1)cat("AHR true (causal) and within AP:",c(ahr_true,ahr_w1),"\n")n_sims <- nsimslibrary(doFuture)library(doRNG)registerDoFuture()registerDoRNG()mrct_sims <-mrct_APregion_sims(dgm=dgm,n_sims=n_sims,wname="AP_region",bw=-log(5),sg_focus="hr", maxk=2, hr.threshold=1.25, hr.consistency=1.0, details=TRUE)t.now<-proc.time()[3]t.min<-(t.now-t.start)/60cat("Minutes (doFuture) simulations",c(t.min),"\n")cat("How often was a subgroup identified=",c(mean(mrct_sims$any_found)),"\n")if(output) save(n_sims,pop_summary,dgm,mrct_sims,file="output/mrct_sims03_k2_v0.Rdata")
Code
if(loadit) load(file="output/mrct_sims03_k2_v0.Rdata")false_pos <-round(100*c(mean(mrct_sims$any_found)),4)cat("What do a few non-null look like","\n")
# Look at large-sample spline in non-AP# This is same as above so do not repeat# dflarge <- draw_sim_stratified(dgm=dgm,ss=1,Ndraw=10000,xname="AP_region",bx=log(5),strata_rand="stratum",checking=FALSE,details=FALSE,return_df=TRUE)# temp <- cox_cs_fit(df=subset(dflarge,AP_region==0),tte.name="y.sim",event.name="event.sim",treat.name="treat.sim",# strata.name="strata.simO",z.name=c("z"), details=FALSE ,boots=0, xlab=c("Biomarker"),maxz = 25, ydel=0.5, cex_legend=0.5, show_plot=TRUE,truebeta.name="loghr.po",# main_title=c("Non-AP large sample ~ training estimation properties"))# Note, here too many individual subgroup cuts, set sg_type=2temp <-summaryout_mrct(pop_summary=pop_summary, mrct_sims=mrct_sims,showtable=FALSE, sg_type=2, tab_caption=c("Identified subgroups and estimation summaries under uniform treatment effects: 2-factor combinations"))ahr_true <-c(temp$res$ahr_true)ahr_unadj <-c(temp$res$ahr_unadj)ahr_sR <-c(temp$res$ahr_sR)ahr_sRw <-c(temp$res$ahr_sRw)ahr_w1 <-c(temp$res$ahr_w1)
The underlying true ITT causal hazard ratio is 0.7
The large-sample (single draw) approximation to ITT un-adjusted is 0.7619
The large-sample ~ to ITT stratified is 0.706
The large-sample ~ to ITT stratified + x-adjustment is 0.7056
The large-sample ~ to AP region analysis is 0.7173
The false-positive rate (identifying a subgroup under uniform benefit)
Across 1,000 simulations (a subgroup was falsely identified) = 7.6%
Summary of subgroups identified and estimation properties are:
Code
# Show the tabletemp$out_table
Identified subgroups and estimation summaries under uniform treatment effects: 2-factor combinations
# Note, here too many individual subgroup cuts, set sg_type=2temp <-summaryout_mrct(pop_summary=pop_summary, mrct_sims=mrct_sims,showtable=FALSE, sg_type=2, tab_caption=c("Identified subgroups and estimation summaries under biomarker effects: 2-factor combinations"))ahr_true <-c(temp$res$ahr_true)ahr_unadj <-c(temp$res$ahr_unadj)ahr_sR <-c(temp$res$ahr_sR)ahr_sRw <-c(temp$res$ahr_sRw)ahr_w1 <-c(temp$res$ahr_w1)
The underlying true ITT causal hazard ratio is 0.7368
The large-sample (single draw) approximation to ITT un-adjusted is 0.7904
The large-sample ~ to ITT stratified is 0.7641
The large-sample ~ to ITT stratified + x-adjustment is 0.7458
The large-sample ~ to AP region analysis is 1.016
The true-positive rate (identifying a subgroup under strong biomarker effects)
Across 1,000 simulations (a subgroup was identified) = 100%
Summary of subgroups identified and estimation properties are:
Code
# Show the tabletemp$out_table
Identified subgroups and estimation summaries under biomarker effects: 2-factor combinations
Simulations under strong biomarker effect but null prognostic effect
Code
# Run sims and store resultst.start<-proc.time()[3]# DGM #log.hrs <- log(c(2.5,1.25,0.50))# example 2log.hrs <-log(c(3,1.25,0.50))dgm <-get_dgm_stratified(df=df.case,zeta=10,knot=5,log.hrs=log.hrs,strata_tte=NULL,details=TRUE)pop_summary <-draw_sim_stratified(dgm=dgm,ss=1,Ndraw=100000,wname="AP_region",bw=0.0,strata_rand="stratum",checking=FALSE,details=FALSE,return_df=FALSE)# True causal AHRahr_true =c(round(pop_summary$AHR,4))# AHR Cox un-adjusted (only treatment)ahr_unadj =c(round(pop_summary$ITT_unadj,4))# Treatment plus randomization stratificaton sRahr_sR =c(round(pop_summary$ITT_sR,4))# sR including adjustment by xahr_sRw =c(round(pop_summary$ITT_sRw,4))# Relative biases bias_unadj <-round(100*c(pop_summary$ITT_unadj-pop_summary$AHR)/pop_summary$AHR,1)bias_sR <-round(100*c(pop_summary$ITT_sR-pop_summary$AHR)/pop_summary$AHR,1)bias_sRw <-round(100*c(pop_summary$ITT_sRw-pop_summary$AHR)/pop_summary$AHR,1)# The bias within W=1 populationahr_w1 =c(round(pop_summary$W_1,4))bias_w1 <-round(100*c(pop_summary$W_1-pop_summary$AHR)/pop_summary$AHR,1)cat("AHR true (causal) and within AP:",c(ahr_true,ahr_w1),"\n")n_sims <- nsimslibrary(doFuture)library(doRNG)registerDoFuture()registerDoRNG()mrct_sims <-mrct_APregion_sims(dgm=dgm,n_sims=n_sims,wname="AP_region",bw=0.0,sg_focus="minSG",details=TRUE)t.now<-proc.time()[3]t.min<-(t.now-t.start)/60cat("Minutes (doFuture) simulations",c(t.min),"\n")cat("How often was a subgroup identified=",c(mean(mrct_sims$any_found)),"\n")if(output) save(n_sims,pop_summary,dgm,mrct_sims,file="output/mrct_sims3Wnull_v0.Rdata")
Code
if(loadit) load(file="output/mrct_sims3Wnull_v0.Rdata")# Look at large-sample spline in non-APdflarge <-draw_sim_stratified(dgm=dgm,ss=1,Ndraw=10000,wname="AP_region",bw=0.0,strata_rand="stratum",checking=FALSE,details=FALSE,return_df=TRUE)temp <-cox_cs_fit(df=subset(dflarge,AP_region==0),tte.name="y.sim",event.name="event.sim",treat.name="treat.sim",strata.name="strata.simO",z.name=c("z"), details=FALSE ,boots=0, xlab=c("Biomarker"),maxz =25, ydel=0.5, cex_legend=0.5, show_plot=TRUE,truebeta.name="loghr.po",main_title=c("Non-AP large sample ~ training estimation properties"))
Subgroups (SGs) with HR estimates indicative of “sub-optimal benefit” (e.g., \(hr \geq 1.0\) threshold), are considered candidates with a “splitting consistency criteria” for selection1.
In essence, if a subgroup that appears to derive the least benefit is identified, then the complement may potentially be considered to derive benefit with a “higher degree of confidence” relative to the ITT population
SGs are formed by single factors (eg, SG1 = males, SG2 = age<65) and two-way combinations (e.g., SG3 = males & age<65)
SGs with a minimum size of \(60\) and with a minimum of number of 10 events in each treatment arm will be considered
Reversing the role of treatment allows for identifying subgroups with “enhanced benefit” (e.g., \(hr \leq 0.65\) threshold)
Continuous factors are cut at either medians (q2), or quartiles (q1, q2, q3, say)
Cuts are also identified by generalized random forests (GRF) which is an identification approach itself –
Generalized random forests (Athey, Tibshirani, and Wager (2019), Cui et al. (2023)) which is based on restricted mean survival time (RMST) comparisons as implemented in the R package Tibshirani et al. (2022):
An RMST benefit of (at least) \(3\) months for control is required for selection of a subgroup, where among tree depths of 1 and 2 the subgroup with the largest RMST benefit (\(\geq 3\) months in favor of control) is selected
Similar to forestSearch, we are targeting relatively large effects
SGs with sample sizes of at least \(60\) participants and a maximum tree depth of 2 is required
7.2 Example of bootstrap adjusted estimates and CIs
Example of bootstrap de-biased hazard-ratio (point) and CI estimation. Note that this is relevant for inference with regard to the training population – That is, we are interested in valid estimation when based on the same population from which subgroups were identified. For example, if the AP data will also be used for identifying the subgroups then we would want to account for the algorithm used in identification. Here we illustrate with the non-AP data.
tALL.now<-proc.time()[3]t.min<-(tALL.now-tALL.start)/60cat("Minutes for ALL Analyses",c(t.min),"\n")
Minutes for ALL Analyses 0.2235167
References
Aalen, O. O., R. J. Cook, and K. Røysland. 2015. “Does Cox Analysis of a Randomized Survival Study Yield a Causal Treatment Effect?”Lifetime Data Analysis, 579–93.
Athey, Susan, Julie Tibshirani, and Stefan Wager. 2019. “Generalized Random Forests.”The Annals of Statistics 47 (2): 1148–78.
Cui, Yifan, Michael R Kosorok, Erik Sverdrup, Stefan Wager, and Ruoqing Zhu. 2023. “Estimating heterogeneous treatment effects with right-censored data via causal survival forests.”Journal of the Royal Statistical Society Series B: Statistical Methodology, February.
León, Larry F., Thomas Jemielita, Zifang Guo, Rachel Marceau West, and Keaven M. Anderson. 2024. “Exploratory Subgroup Identification in the Heterogeneous Cox Model: A Relatively Simple Procedure.”Statistics in Medicine 43 (20): 3921–42. https://doi.org/https://doi.org/10.1002/sim.10163.
Tibshirani, Julie, Susan Athey, Rina Friedberg, Vitor Hadad, David Hirshberg, Luke Miner, Erik Sverdrup, Stefan Wager, and Marvin Wright. 2022. “Package Grf.”
Footnotes
“splitting consistency criteria” judges the robustness/realness↩︎